xtmixed/mixed || panel data

Eva Lan

Join Date: Mar 2018

Posts: 23
#1

28 Apr 2018, 10:58

Hi everyone,

does anyone know how I can account for the panel structure of my data in a multilevel mixed model?
My data are declared as panel data with xtset id t.

I have two questions about that.

First:

I tried to write:
xtset id t
xtmixed $ylist $xlist

But the output doesn't seem to take the panel structure into account.

So I wrote:
xtset id t
xtmixed $ylist $xlist || id:, mle

It still doesn't seem to estimate the mixed effects for my panel data.

The commands:
xtmixed $ylist $xlist || id: || t:
xtmixed $ylist $xlist || id: || t:, mle
mixed y x || _all: R.id || _all: R.t
didn't work at all or the system hung.

What did I do wrong?

What I actually wanted was a basic overview of all of my variables first, and then to add multiple levels.

Second:

My data cover three main time periods (each containing multiple weeks).
The treatment takes place in the second period.
Some independent variables exist only within the second period because they relate to the treatment. I am interested in the effect of the treatment on the outcome, and in how the other independent variables (those that exist in every period, such as age, kids, and so on) influence the treatment's effect on the outcome.
Is it correct to create one random-effects equation for whether the person was treated and an additional random-effects equation for the three periods? Would that give me the information I am looking for?
Originally I tried to use the DiD estimator, but I couldn't include any additional independent variables or metric variables. That is why I was told to try a mixed-effects model instead.

Right now, for example, my Stata 13.0 is not responding because of the "mixed $ylist $xlist || _all: R.id || _all: R.t" command.
Thank you very much for your help!!

Best wishes,
Eva
Tags: mixed, panel, panel data, xtmixed, xtset
Richard Williams

Join Date: Apr 2014

Posts: 5043
#2

28 Apr 2018, 12:06

I had a student who was getting weird messages doing mixed models in Stata 13. The first thing I did was redo the analysis in Stata 15 and all was well. So I told him to go to a lab and use Stata 15. I don't know if that will solve any of your problems but you might try it if possible.

Otherwise, answering your Qs is a bit hard because you aren't showing both commands and output, nor are you providing a reproducible example. See pt #12 of the FAQ.

Some simple comparisons of -me- (mixed-effects) and -xt- models can be found at

https://www3.nd.edu/~rwilliam/xsoc73994/Multilevel.pdf

-------------------------------------------
Richard Williams
Professor Emeritus of Sociology
University of Notre Dame
StataNow Version: 19.5 MP (2 processor)
WWW: https://academicweb.nd.edu/~rwilliam/
Eva Lan

Join Date: Mar 2018

Posts: 23
#3

28 Apr 2018, 14:38

Thank you very much for your response!

After reducing the data to test the commands again, they worked. They run, although the results don't make any sense and aren't significant, surely because of the small sample size. With all of my data the commands don't work. Might that be a problem similar to the one your student had?
I only have access to Stata 13.0. I don't know what I am doing wrong.

Here the variables are now divided into time-varying and time-invariant ones. The time-varying variables are in the fixed-effects model; the time-invariant variables are in the random-effects model:

. global xlist age gender kids ad conn treat

. global ylist dv

. xtset id t
panel variable: id (unbalanced)
time variable: t, 1 to 27
delta: 1 unit

xtreg $ylist ad conn, fe

Fixed-effects (within) regression Number of obs = 1161
Group variable: id Number of groups = 43

R-sq: within = 0.0024 Obs per group: min = 27
between = 0.0397 avg = 27.0
overall = 0.0051 max = 27

F(2,1116) = 1.33
corr(u_i, Xb) = -0.7858 Prob > F = 0.2648

------------------------------------------------------------------------------
dv | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ad | -.0864052 .1129356 -0.77 0.444 -.3079953 .1351849
conn | .00661 .0043724 1.51 0.131 -.001969 .0151891
_cons | -.2163903 .3322991 -0.65 0.515 -.8683917 .4356111
-------------+----------------------------------------------------------------
sigma_u | .575178
sigma_e | .86761727
rho | .30530911 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(42, 1116) = 4.21 Prob > F = 0.0000

. xtreg $ylist age gender kids treat, re
note: gender omitted because of collinearity

Random-effects GLS regression Number of obs = 1161
Group variable: id Number of groups = 43

R-sq: within = 0.0000 Obs per group: min = 27
between = 0.0470 avg = 27.0
overall = 0.0070 max = 27

Wald chi2(3) = 1.92
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.5885

------------------------------------------------------------------------------
dv | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0088166 .0082407 1.07 0.285 -.0073348 .024968
gender | 0 (omitted)
kids | .0242646 .0621099 0.39 0.696 -.0974685 .1459977
treat | .1369685 .1312641 1.04 0.297 -.1203043 .3942414
_cons | -.2781332 .4031825 -0.69 0.490 -1.068356 .51209
-------------+----------------------------------------------------------------
sigma_u | .3236048
sigma_e | .86787372
rho | .12206205 (fraction of variance due to u_i)
------------------------------------------------------------------------------

When running e.g. the mixed y x || _all: R.id || _all: R.t command (or xtmixed $ylist $xlist || id: || t: or xtmixed $ylist $xlist || id: || t:, mle), the results differ from the results above:

mixed dv $xlist || _all: R.id || _all: R.t
note: gender omitted because of collinearity

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0: log likelihood = -1511.3731
Iteration 1: log likelihood = -1511.2903
Iteration 2: log likelihood = -1511.29
Iteration 3: log likelihood = -1511.29

Computing standard errors:

Mixed-effects ML regression Number of obs = 1161
Group variable: _all Number of groups = 1

Obs per group: min = 1161
avg = 1161.0
max = 1161

Wald chi2(5) = 8.31
Log likelihood = -1511.29 Prob > chi2 = 0.1398

------------------------------------------------------------------------------
dv | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0092901 .0075981 1.22 0.221 -.005602 .0241821
gender | 0 (omitted)
kids | .0015879 .0625992 0.03 0.980 -.1211043 .1242802
ad | .018738 .0140429 1.33 0.182 -.0087856 .0462616
conn | .0054262 .0030043 1.81 0.071 -.0004621 .0113145
treat | .1273217 .1350951 0.94 0.346 -.1374597 .3921032
_cons | -.4228962 .4113943 -1.03 0.304 -1.229214 .3834217
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
_all: Identity |
var(R.id) | .081998 .0237324 .0464989 .1445984
-----------------------------+------------------------------------------------
_all: Identity |
var(R.t) | .0025258 .0055673 .0000336 .1899388
-----------------------------+------------------------------------------------
var(Residual) | .7494409 .0320912 .6891103 .8150534
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(2) = 61.06 Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

Maybe I mixed something up.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#4

28 Apr 2018, 16:01

I have no thoughts to add on why your model will not run with your full data.

Turning to the difference in results you get with

Code:

xtreg $ylist age gender kids treat, re

and

Code:

mixed dv $xlist || _all: R.id || _all: R.t

the models are quite different and there is no reason to expect that the results will be even slightly similar. Two major differences are:

1. The second model contains variables ad and conn that do not appear in the first.
2. The second model contains random effects for time, crossed with those for id. The first model contains no effects for time at all.

When you change even one variable in a model, everything can change drastically, both in magnitude and sign. When you make major changes involving potentially dozens of variables, as here, it is miraculous that there is even a vague resemblance between the results of the two models.
Eva Lan

Join Date: Mar 2018

Posts: 23
#5

29 Apr 2018, 11:51

Thank you for your feedback.

I was told that it had to be similar and wondered myself about that. But because I am not that used to mixed-models and couldn't find any other statements I simply believed it.
So I will try out a new method.
Maybe I can find a solution to that.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#6

29 Apr 2018, 13:10

Well, it isn't clear to me what you're trying to accomplish.

The direct translation of

Code:

xtreg $ylist age gender kids treat, re

to -mixed- is

Code:

mixed $ylist age gender kids treat || id:

These estimate exactly the same model. The estimation algorithms are different, so the results may differ, but only slightly.
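
As a minimal sketch of this equivalence, the commands below fit the same random-intercept model three ways. The dataset (nlswork, loaded with -webuse-) and the variables ln_wage, age, and tenure are stand-ins chosen only for illustration; they are not the poster's data. -xtreg, re- uses GLS, while -xtreg, mle- and -mixed- both use maximum likelihood, so the last two should agree most closely.

```stata
* Illustrative data only (assumption): NLS young women panel from -webuse-
webuse nlswork, clear
xtset idcode year

* Random-intercept model estimated by GLS
xtreg ln_wage age tenure, re

* Same model estimated by maximum likelihood
xtreg ln_wage age tenure, mle

* Same model in -mixed- syntax: one random intercept per idcode
mixed ln_wage age tenure || idcode:
```

Comparing the coefficient tables side by side is a quick way to confirm that the three commands describe one model.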

If you explain more clearly what you're looking for, someone might be able to tell you how to code it for Stata.
Eva Lan

Join Date: Mar 2018

Posts: 23
#7

29 Apr 2018, 13:37

The reason the variables are divided between the first two commands is that the time-invariant variables age, gender, kids, and treat were omitted from the fixed-effects model, which can only include time-varying variables. I therefore moved the time-invariant variables to the random-effects model. The fixed-effects model was confirmed by the Hausman test.
The mixed-effects model is supposed to make this division of the variables redundant and combine all of the variables into one model without omitting any of them.

Last edited by Eva Lan; 29 Apr 2018, 13:40.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#8

29 Apr 2018, 13:43

The mixed effect model is supposed to make the division redundant and combine the whole variables into one model without omitting any of them.

That is correct. -xtreg, re- is just a special case, the simplest possible case, of a mixed-effects model: it allows only two levels in the model and it allows random intercepts, but not random slopes. Any model you can fit with -xtreg, re- can also be fit with -mixed-, though the reverse is not true. Because -xtreg, re- is less general, it runs a bit faster than -mixed-, and it also, in my experience, is less likely to encounter convergence problems with difficult cases. So, in general, if you don't need random slopes and have only two levels, I recommend using -xtreg, re-. The results should be the same, in any case, but they will come a bit more easily.
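
For contrast, here is a sketch of a model that -xtreg, re- cannot fit: a random slope in addition to the random intercept. The dataset nlswork and its variables are illustrative assumptions, not the poster's data, and the choice of tenure as the random-slope variable is arbitrary.

```stata
* Illustrative data only (assumption): NLS young women panel from -webuse-
webuse nlswork, clear

* Random intercept plus a random slope on tenure for each idcode.
* tenure also appears in the fixed part of the model.
mixed ln_wage age tenure || idcode: tenure, covariance(unstructured)
```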
Eva Lan

Join Date: Mar 2018

Posts: 23
#9

29 Apr 2018, 13:53

Thank you, now it starts to make more sense.
So you would suggest to run -xtreg,re- for all of the variables instead of the Mixed-Effects Model. But will the results for the time-variate variables be acceptable as valid even if the hausman test would recommend to use the Fixed-Effects Model?
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#10

29 Apr 2018, 15:06

So you would suggest to run -xtreg,re- for all of the variables instead of the Mixed-Effects Model.

Yes, it's just simpler to code and you are less likely to run into a convergence problem. Other than that, however, it's the same model and the results should be the same either way.

But will the results for the time-variate variables be acceptable as valid even if the hausman test would recommend to use the Fixed-Effects Model?

Well, I'm not a big fan of the Hausman test, or any other statistical test as the basis for selecting a model. But I understand that in some disciplines, it is something like heresy to use a random-effects model when Hausman says fixed-effects is better. To do the Hausman test, you need to have the same variables in both the fixed and random-effects models; the test is not valid if the models differ on what variables are included. (In fact, last time I tried to use -hausman- or -xthausman- I don't think it even allows you to run the test if the models differ.)
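
For reference, the usual -hausman- workflow looks roughly like the sketch below, with the same regressors in both models. The dataset nlswork and the variables ln_wage, age, and tenure are illustrative assumptions; with the poster's data the two -xtreg- calls would both use the same variable list.

```stata
* Illustrative data only (assumption): NLS young women panel from -webuse-
webuse nlswork, clear
xtset idcode year

* Fixed-effects model, stored under the name fe
xtreg ln_wage age tenure, fe
estimates store fe

* Random-effects model with the same regressors, stored under the name re
xtreg ln_wage age tenure, re
estimates store re

* Consistent estimator first, efficient estimator second
hausman fe re
```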

It sounds like your concern is that you want estimates of the effects of time-invariant variables, but you can't get them from a fixed effects model, so you would like to use a mixed-effects model to get those, and you are concerned that you will be called out for going against what the Hausman test tells you. Well, that is a dilemma for which there is no solution. And using -mixed- or -xtreg, re- to fit the model makes no difference.

You are looking for two things that mathematically cannot both happen:

1. Guarantee of consistent estimation (fixed effects model does this, random effects does not.)
2. Ability to estimate effects of time-invariant variables (fixed effects cannot do this).

So you have to make a choice. You can't have it both ways, so you have to decide which of these desires you will sacrifice. Or, you can try to defend your random effects model against inconsistency by including more covariates, to improve the chances that the random effects really will be independent of the bottom level predictors. With 43 groups you could add another predictor or two to your model if need be. Or you might consider whether estimating the effects of age, gender, and kids is really central to your research goals. Do you really need those estimates, or were those variables just thrown in as "the usual suspects" to be adjusted for? If the latter, then just stick with fixed effects and forget about those variables.

In the end, it is important that your approach be acceptable to your intended audience. I am not among that audience. So I think at this point you need to discuss this with colleagues in your own discipline.
Eva Lan

Join Date: Mar 2018

Posts: 23
#11

29 Apr 2018, 15:24

Thank you very much for your detailed feedback! It really helps me a lot!
I am going to consider all of your points in my further steps and adapt my approach.
maria costanza soldini

Join Date: May 2018

Posts: 6
#12

24 May 2018, 02:40

Dear Statalisters,
I have a data set of 154 dental implants in 71 patients. I have recorded bone change at 4 time points (years 1, 2, 3, 4).
First I reshaped my data set to long format: reshape long level change, i(CODICEIMP) j(time)
Then I declared it as a panel data set:
xtset CODICEIMP time

I have predictors at the patient level (sex, age, smoking) and at the implant level (type of surgery, depth, prosthesis, diameter).
I want to predict my outcome (bone change) over time on the basis of my predictors at both levels (implant and patient), so I need a random slope model.

I set up my null model:
mixed change i.time ||CODICEPZ: time, cov(unstruc) || CODICEIMP: time, cov(unstruc)mle

But if I add my first predictor at the implant level:
mixed change i.time, cov(unstruc) ||CODICEPZ: time, cov(unstruc) || CODICEIMP: time DEPTH, cov(unstruc)mle

then Stata gives me:
Hessian is not negative semidefinite
conformability error

I need a nested model because of my outcome of interest (bone change related to implant and patient variables). How can I solve my problem?

Thank you!
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#13

24 May 2018, 08:31

Your models are incorrect, even the one that apparently ran without difficulty. Any variable for which a random slope (at any level) is estimated, must also appear, in identical form, among the bottom-level fixed effects. So even your first model is problematic because you used i.time in the bottom level but then tried to assign a random slope to time. That model is mis-specified and its results are meaningless. Similarly, when you add the random slope for DEPTH, you must also include DEPTH in the fixed effects. So the models should be:

Code:

mixed change time || CODICEPZ: time, cov(unstruc) || CODICEIMP: time, cov(unstruc) mle
mixed change time DEPTH || CODICEPZ: time, cov(unstruc) || CODICEIMP: time DEPTH, cov(unstruc) mle

Whether these problems caused your Hessian problem, I cannot say. But even had these models run uneventfully they would have produced incorrect results.

Now, these models do not handle time in the same way you originally specified. In particular, these models treated time as a continuous variable linearly related to outcome rather than as four discrete time periods whose relationships to outcome are arbitrary. If you really need to model time as discrete and also have random slopes for the discrete time variables, then you cannot, unfortunately, do that with factor-variable notation as the random-effects part of -mixed- does not support it. Instead you must generate your own indicators for the time variables and then include them in both the fixed effects and at each level where you want random slopes for them.
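
A minimal sketch of that hand-made indicator approach, assuming the poster's long-format data with time coded 1 to 4. The indicator names t_1 to t_4 are hypothetical, and the default covariance structure (independent) is used here only to keep the example small. The /// line continuations require running this from a do-file.

```stata
* Assumes the poster's long-format data, with time coded 1, 2, 3, 4
tabulate time, generate(t_)        // creates indicators t_1, t_2, t_3, t_4

* t_1 is the reference period. The remaining indicators appear in the
* fixed part and at each level where random slopes are wanted.
mixed change t_2 t_3 t_4          ///
    || CODICEPZ:  t_2 t_3 t_4     ///
    || CODICEIMP: t_2 t_3 t_4, mle
```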

Let me also point out that if you do go for the discrete time approach, you are stretching your data very thin, perhaps to the breaking point. You will be estimating 9 parameters for time effects (3 time periods other than the reference * 3 levels) and then you also estimate 18 parameters for the covariances of those slopes with each other (2 levels * 3 * 3) plus 6 more for the covariances of those slopes with the random intercepts (2 levels * 3). That's 33 parameters estimated for time in a data set with only 154 observations among 71 patients! Not to mention the parameters for all your other variables. Even if you decide to model time as a single continuous variable, all of those random-slope covariances are going to start to bite as you add in the random slopes on all of those other variables you are interested in. The number of covariances required with -cov(unstruct)- rises as the square of the number of random slopes being estimated. It blows up quickly.
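
As a rough cross-check on how quickly this grows (a sketch of the general formula, not a recount of the tally above): an unstructured covariance matrix for q random effects at one level has q(q+1)/2 distinct variances and covariances. The loop below simply tabulates that formula. With a random intercept plus three time indicators (q = 4) it gives 10 per level, and the count rises quadratically from there.

```stata
* q = number of random effects at one level (intercept plus slopes)
forvalues q = 1/6 {
    display "q = `q':  " `q'*(`q'+1)/2 " variance/covariance parameters per level"
}
```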

If you really need all of those parameters, you are going to need a much larger data set to get meaningfully precise estimates of them. My guess, though, is that you have no real need for the numerous covariances implied by -cov(unstruct)- and would do just fine with -cov(exch)- (which is the more typical situation for repeated observations anyway).
maria costanza soldini

Join Date: May 2018

Posts: 6
#14

28 May 2018, 00:00

Thank you for your help. I will try to treat my time variable as continuous, or I will look for another model...